The Search Problem in Mixture Models

Authors

  • Avik Ray
  • Joe Neeman
  • Sujay Sanghavi
  • Sanjay Shakkottai
Abstract

We consider the task of learning the parameters of a single component of a mixture model when we are given side information about that component; we call this the “search problem” in mixture models. We would like to solve it with lower computational and sample complexity than the original problem, in which one learns the parameters of all components. Our main contributions are a simple but general model for the notion of side information, and a corresponding simple matrix-based algorithm for solving the search problem in this general setting. We then specialize this model and algorithm to four common scenarios: Gaussian mixture models, LDA topic models, subspace clustering, and mixed linear regression. For each of these we show that if (and only if) the side information is informative, we obtain better sample complexity than existing standard mixture model algorithms (e.g. tensor methods). We also illustrate several natural ways to obtain such side information for specific problem instances. Our experiments on real datasets (NY Times, Yelp, BSDS500) further demonstrate the practicality of our algorithms, showing significant improvement in runtime and accuracy.


Similar resources

Investigating Zone Pricing in a Location-Routing Problem Using a Variable Neighborhood Search Algorithm

In this paper, we assume a firm tries to determine the optimal price, vehicle route, and depot location in each zone to maximise its profit. Zone pricing is therefore studied, which contributes to the literature on location-routing problems (LRP). Zone pricing is one of the most important pricing policies and is widely used by many companies. The proposed problem is v...


Using a new modified harmony search algorithm to solve multi-objective reactive power dispatch in deterministic and stochastic models

The optimal reactive power dispatch (ORPD) is a very important aspect of power system planning and is a highly nonlinear, non-convex optimization problem because it consists of both continuous and discrete control variables. Since the power system has inherent uncertainty, this paper presents both deterministic and stochastic models for the ORPD problem in multi objective and sin...


An Overview of the New Feature Selection Methods in Finite Mixture of Regression Models

Variable (feature) selection has attracted much attention in contemporary statistical learning and recent scientific research. This is mainly due to the rapid advancement in modern technology that allows scientists to collect data of unprecedented size and complexity. One type of statistical problem in such applications is concerned with modeling an output variable as a function of a sma...


The Negative Binomial Distribution Efficiency in Finite Mixture of Semi-parametric Generalized Linear Models

Introduction: Selecting the appropriate statistical model for the response variable is one of the most important problems in finite mixtures of generalized linear models. One distribution that is problematic in a finite mixture of semi-parametric generalized statistical models is the Poisson distribution. In this paper, to overcome over dispersion and computational burden, finite ...


Advertising Keyword Suggestion Using Relevance-Based Language Models from Wikipedia Rich Articles

When emerging technologies such as Search Engine Marketing (SEM) face tasks that require human level intelligence, it is inevitable to use the knowledge repositories to endow the machine with the breadth of knowledge available to humans. Keyword suggestion for search engine advertising is an important problem for sponsored search and SEM that requires a goldmine repository of knowledge. A recen...


Comparison of particle swarm optimization and tabu search algorithms for portfolio selection problem

Using metaheuristic models and evolutionary algorithms to solve the portfolio problem has been considered in recent years. In this study, we optimized two-sided risk measures using particle swarm optimization and tabu search algorithms. A standard exact penalty function transforms the considered portfolio selection problem into an equivalent unconstrained minimization problem. And in final...



Journal:
  • CoRR

Volume abs/1610.00843

Publication date: 2016